Sessionization: A Vital Stage in Data Preprocessing of Web Usage Mining (A Survey)

Authors

  • Dharmendra Patel
  • Kalpesh Parikh
  • Atul Patel
Abstract

The World Wide Web has affected almost every aspect of modern life. The Web has many unique characteristics that make mining useful information and knowledge from it a challenging task. Web mining uses many data mining techniques, but it is not simply an application of traditional data mining, because Web data is heterogeneous and unstructured. Web mining tasks fall into three categories: Web Structure Mining, Web Content Mining, and Web Usage Mining. The goal of Web Usage Mining is to capture, model, and analyze the behavioral patterns and profiles of users interacting with a Web site. Web Usage Mining consists of several stages; this paper focuses on the first, data preprocessing. Data preprocessing consists of data cleaning, page view identification, sessionization, data integration, and data transformation. Of these, the paper concentrates on the most complex part, sessionization, and covers many of its important aspects, which should be useful to researchers working in the Web Usage Mining field.
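
To make the sessionization step concrete, the following is a minimal sketch of one common heuristic, time-based sessionization, in which a new session starts whenever a user is inactive for longer than a fixed timeout. The abstract does not say which heuristics the survey covers, so the function name `sessionize`, the `(user_id, timestamp, url)` record layout, and the 30-minute default timeout are all assumptions made for illustration only.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def sessionize(requests, timeout=timedelta(minutes=30)):
    """Group (user_id, timestamp, url) records into per-user sessions.

    Hypothetical helper: a new session begins when the gap between two
    consecutive requests from the same user exceeds `timeout`.
    The 30-minute default is an assumed, commonly used value.
    """
    # Collect each user's requests (assumes data cleaning and user
    # identification have already been done upstream).
    by_user = defaultdict(list)
    for user_id, ts, url in requests:
        by_user[user_id].append((ts, url))

    sessions = []
    for user_id, hits in by_user.items():
        hits.sort(key=lambda h: h[0])  # order requests by timestamp
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > timeout:
                # Inactivity gap too long: close the current session.
                sessions.append((user_id, [u for _, u in current]))
                current = []
            current.append(hit)
        sessions.append((user_id, [u for _, u in current]))
    return sessions


if __name__ == "__main__":
    log = [
        ("u1", datetime(2012, 1, 1, 10, 0), "/a"),
        ("u1", datetime(2012, 1, 1, 10, 10), "/b"),
        ("u1", datetime(2012, 1, 1, 11, 0), "/c"),
        ("u2", datetime(2012, 1, 1, 10, 5), "/a"),
    ]
    for user, pages in sessionize(log):
        print(user, pages)
```

In this sketch the 50-minute gap before `/c` splits user `u1` into two sessions. Referrer-based or semantic heuristics would replace the gap test with a different splitting rule.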


Similar resources

Traversal Pattern Mining in Web Usage Data

Web usage mining aims to discover useful patterns in web usage data; these patterns provide useful information about the user’s browsing behavior. This chapter examines different types of web usage traversal patterns and the related techniques used to uncover them, including Association Rules, Sequential Patterns, Frequent Episodes, Maximal Frequent Forward Sequences, and Maximal Frequent S...


A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log

Web usage mining (WUM), also known as Web Log Mining, is the application of data mining techniques to large volumes of data in order to extract useful and interesting user behaviour patterns from web logs and so improve web-based applications. This paper aims to improve data discovery by mining the usage data from log files. In this paper the work is done in three phases. Firs...


A Novel Technique for Sessions Identification in Web Usage Mining Preprocessing

The World Wide Web continues to grow at an incredible rate. Users find it very difficult to extract useful and relevant information from the huge amount of information available. These problems can be addressed by Web Usage Mining, which involves preprocessing, pattern discovery and pattern analysis. Preprocessing is an important process which converts raw web log data into transactions. Appl...


A Survey on Preprocessing Methods for Web Usage Data

The World Wide Web is a huge repository of web pages and links. It provides an abundance of information for Internet users. The growth of the web is tremendous, as approximately one million pages are added daily. Users’ accesses are recorded in web logs. Because of the tremendous usage of the web, web log files are growing at a faster rate and becoming huge. Web data mining is the applicati...


Semantic Preprocessing of Web Request Streams for Web Usage Mining

Efficient data preparation needs to discover the underlying knowledge from complicated Web usage data. In this paper, we have focused on two main tasks, semantic outlier detection from online Web request streams and segmentation (or sessionization) of them. We thereby exploit semantic technologies to infer the relationships among Web requests. Web ontologies such as taxonomies and directories c...




Publication date: 2012